Data Introduction

Data Dictionary

Variable              Description
Date                  Month-end date of the listed month and year
Region                Location for which the housing data was collected
Median.Sale.Price     Median sale price of homes for the given region and month
Homes.Sold            Number of homes sold for the given region and month
New.Listings          Number of new homes listed for sale for the given region and month
Inventory             Total number of active listings on the market on the last day of the given period
Days.on.Market        Median number of days that homes sold during the period spent on the market before going under contract. Excludes homes that spent more than a year on the market before going under contract
Average.Sale.To.List  How close the median home sale price was to its original listing price
Unrate                US unemployment rate
T10Y                  10-year Treasury rate less the 2-year Treasury rate
M30Y                  Freddie Mac 30-year fixed-rate mortgage average in the United States
UMCSENT               University of Michigan Consumer Sentiment index
Sold.Inventory.Ratio  Ratio (percent) = (Homes.Sold / Inventory) * 100
Market.Type           Seller’s market when Sold.Inventory.Ratio is above 20%; buyer’s market when it is below 15%; equilibrium when it is between 15% and 20%
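
The last two dictionary entries are derived fields, so a short Python sketch may help make the rule concrete. The function names, the class labels, and the sample numbers are illustrative assumptions, not taken from the dashboard; the sketch also assumes that ratios of exactly 15% or 20% count as equilibrium.

```python
def sold_inventory_ratio(homes_sold: float, inventory: float) -> float:
    """Sold.Inventory.Ratio in percent: (Homes.Sold / Inventory) * 100."""
    if inventory <= 0:
        raise ValueError("inventory must be positive")
    return homes_sold / inventory * 100


def market_type(ratio: float) -> str:
    """Apply the data-dictionary thresholds to a ratio given in percent."""
    if ratio > 20:
        return "Seller"
    if ratio < 15:
        return "Buyer"
    # 15% to 20%; treating the boundaries as equilibrium is an assumption
    return "Equilibrium"


# Hypothetical month: 450 homes sold against 2,000 active listings.
ratio = sold_inventory_ratio(450, 2000)
print(round(ratio, 2), market_type(ratio))
```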

Data Preview

Multiple Linear Regression

Region

Days on Market

Unemployment

10-Yr Treasury

30-Yr Fixed

Sold:Inventory Ratio

Season

Model Results

Estimate Std. Error t value Pr(>|t|)
(Intercept) 97.8631424 0.2793176 350.3651358 0.0000000
RegionBoston 0.9026848 0.1145817 7.8780882 0.0000000
RegionChicago -0.7496012 0.1137079 -6.5923379 0.0000000
RegionLos Angeles 1.2277050 0.1145961 10.7133231 0.0000000
RegionPhiladelphia -1.5843394 0.1141580 -13.8784833 0.0000000
RegionSeattle 0.0543675 0.1352373 0.4020159 0.6877559
RegionWashington DC 0.2080417 0.1167955 1.7812482 0.0751671
Days.on.Market -0.0245001 0.0021480 -11.4060358 0.0000000
Unrate -0.0587881 0.0199996 -2.9394608 0.0033614
T10Y -0.1815744 0.0510911 -3.5539319 0.0003968
M30Y 0.1103290 0.0357295 3.0878997 0.0020698
Sold.Inventory.Ratio 0.0383982 0.0015958 24.0619037 0.0000000
SeasonFall 0.1941660 0.0909041 2.1359429 0.0329198
SeasonSpring 0.7517623 0.0907052 8.2879767 0.0000000
SeasonSummer 0.5678624 0.0952157 5.9639569 0.0000000

Tolerance/VIF

Variables Tolerance VIF
RegionBoston 0.5713282 1.750307
RegionChicago 0.5801425 1.723715
RegionLos Angeles 0.5711848 1.750747
RegionPhiladelphia 0.5755776 1.737385
RegionSeattle 0.4101319 2.438240
RegionWashington DC 0.5498755 1.818594
Days.on.Market 0.4324246 2.312542
Unrate 0.6224803 1.606477
T10Y 0.4930446 2.028214
M30Y 0.5397531 1.852699
Sold.Inventory.Ratio 0.4200181 2.380850
SeasonFall 0.6065913 1.648556
SeasonSpring 0.5777270 1.730921
SeasonSummer 0.5528990 1.808649
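
The two columns above carry the same information: under the standard definition, tolerance is 1 - R² from regressing a predictor on the remaining predictors, and VIF is its reciprocal. The Python sketch below checks that relationship for the three largest VIF values in the table and prints the R² each one implies. It is only a consistency check on the printed numbers, not the code that produced them.

```python
# (tolerance, VIF) pairs copied from the Tolerance/VIF table
vif_table = {
    "RegionSeattle": (0.4101319, 2.438240),
    "Days.on.Market": (0.4324246, 2.312542),
    "Sold.Inventory.Ratio": (0.4200181, 2.380850),
}

implied_r2 = {}
for name, (tolerance, vif) in vif_table.items():
    # R^2 of this predictor regressed on all the other predictors
    implied_r2[name] = 1 - tolerance
    print(f"{name}: 1/tolerance = {1 / tolerance:.4f}, "
          f"reported VIF = {vif:.4f}, implied R^2 = {implied_r2[name]:.3f}")
```

Every VIF in the table is below 2.5, well under the thresholds of 5 or 10 that are commonly used as rules of thumb for problematic multicollinearity.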

QQ & Fitted vs Residuals Plots

Interpretations

Variable              Description
\(R^2_{adj}\)         The model explains 83.27% of the variability in Average Sale to List
Region                Region is a significant factor overall in Average Sale to List, although Seattle and Washington DC do not differ significantly from the National baseline at the 5% level. Holding all other variables constant, Average Sale to List is on average 0.90 points higher in Boston, 0.75 points lower in Chicago, 1.23 points higher in Los Angeles, and 1.58 points lower in Philadelphia than the National baseline
Days on Market        A one-day increase in days on market is associated with an Average Sale to List that is 0.02 points lower on average
Unrate                A one-percentage-point increase in the unemployment rate is associated with an Average Sale to List that is 0.06 points lower on average
T10Y                  A one-percentage-point increase in T10Y (the 10-year Treasury rate less the 2-year Treasury rate) is associated with an Average Sale to List that is 0.18 points lower on average
M30Y                  A one-percentage-point increase in the 30-year fixed mortgage rate is associated with an Average Sale to List that is 0.11 points higher on average
Sold:Inventory Ratio  A one-point increase in the Sold:Inventory Ratio (measured in percent) is associated with an Average Sale to List that is 0.04 points higher on average
Season                Season is a significant factor overall in Average Sale to List. Relative to Winter, the omitted baseline season, Average Sale to List is on average 0.19 points higher in Fall, 0.57 points higher in Summer, and 0.75 points higher in Spring
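
To see how these interpretations combine, the Python sketch below plugs the coefficients from the Model Results table into the fitted linear equation. The function name and every input value in the example call are hypothetical; the sketch assumes that National and Winter are the baseline levels, which is why they have no coefficient of their own.

```python
# Coefficients copied from the Model Results table
COEF = {
    "(Intercept)": 97.8631424,
    "RegionBoston": 0.9026848,
    "RegionChicago": -0.7496012,
    "RegionLos Angeles": 1.2277050,
    "RegionPhiladelphia": -1.5843394,
    "RegionSeattle": 0.0543675,
    "RegionWashington DC": 0.2080417,
    "Days.on.Market": -0.0245001,
    "Unrate": -0.0587881,
    "T10Y": -0.1815744,
    "M30Y": 0.1103290,
    "Sold.Inventory.Ratio": 0.0383982,
    "SeasonFall": 0.1941660,
    "SeasonSpring": 0.7517623,
    "SeasonSummer": 0.5678624,
}


def predict_sale_to_list(region, season, days_on_market, unrate,
                         t10y, m30y, sold_inventory_ratio):
    """Fitted Average.Sale.To.List for one hypothetical observation."""
    y = COEF["(Intercept)"]
    # Baseline levels (assumed National and Winter) contribute nothing;
    # note that an unknown label is also treated as the baseline here.
    y += COEF.get(f"Region{region}", 0.0)
    y += COEF.get(f"Season{season}", 0.0)
    y += COEF["Days.on.Market"] * days_on_market
    y += COEF["Unrate"] * unrate
    y += COEF["T10Y"] * t10y
    y += COEF["M30Y"] * m30y
    y += COEF["Sold.Inventory.Ratio"] * sold_inventory_ratio
    return y


# Hypothetical inputs, chosen only for illustration
example = dict(days_on_market=20, unrate=4.0, t10y=0.5, m30y=6.5,
               sold_inventory_ratio=25.0)
print(round(predict_sale_to_list("Boston", "Spring", **example), 2))
print(round(predict_sale_to_list("National", "Winter", **example), 2))
```

With the other inputs held fixed, the gap between the two printed values is the sum of the Boston and Spring coefficients, which is what "holding all other variables constant" means in the table.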

Logistic Regression

Coefficient Plot

Final Model Summary

Estimate Std. Error z value Pr(>|z|)
(Intercept) -22.4354250 5.6706620 -3.9564031 0.0000761
Median.Sales.Price 0.0000296 0.0000092 3.1998449 0.0013750
New.Listings 0.0013742 0.0003255 4.2220734 0.0000242
Inventory 0.0000270 0.0001249 0.2161687 0.8288562
T10Y 0.9388092 0.5792355 1.6207729 0.1050664
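
Logistic coefficients are on the log-odds scale, so exponentiating them gives odds ratios that are easier to read. The Python sketch below does this for the estimates above. The step sizes (10,000 for price, 100 listings, 1,000 listings of inventory) are assumptions chosen for readability, and the summary does not state which outcome is coded 1, so the ratios refer only to "the outcome coded 1".

```python
import math

# Estimates copied from the Final Model Summary
coef = {
    "Median.Sales.Price": 0.0000296,
    "New.Listings": 0.0013742,
    "Inventory": 0.0000270,
    "T10Y": 0.9388092,
}

# Size of the change in each predictor (illustrative assumption)
step = {
    "Median.Sales.Price": 10_000,
    "New.Listings": 100,
    "Inventory": 1_000,
    "T10Y": 1,
}

# Multiplicative change in the odds of the outcome coded 1
odds_ratio = {name: math.exp(coef[name] * step[name]) for name in coef}

for name, value in odds_ratio.items():
    print(f"{name}: +{step[name]} -> odds multiplied by {value:.3f}")
```

Only Median.Sales.Price and New.Listings are significant at the 5% level in the summary (p = 0.0014 and p < 0.0001); the odds ratios for Inventory and T10Y should be read with that in mind.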

ROC Curve

Assumptions

Confusion Matrix - Training

Prediction
0 1
0 46 8
1 6 45

Confusion Matrix - Testing

Prediction
0 1
0 18 4
1 4 18
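
Overall accuracy can be read off either matrix as the diagonal sum divided by the total count, which does not depend on which axis holds the predictions. A minimal Python sketch, using the counts printed above:

```python
def accuracy(matrix):
    """Share of observations on the diagonal of a square confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


train_matrix = [[46, 8], [6, 45]]   # counts from the training matrix
test_matrix = [[18, 4], [4, 18]]    # counts from the testing matrix

print(f"training accuracy: {accuracy(train_matrix):.1%}")
print(f"testing accuracy:  {accuracy(test_matrix):.1%}")
```

This works out to 91 of 105 training observations (about 86.7%) and 36 of 44 testing observations (about 81.8%).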

Median Sales Price

New Listings

Inventory

T10Y

kNN Classification

Model Tuning

Predictions Plot

Confusion Matrix: Training

          Reference
Prediction Cold Cool Neutral Warm Hot
   Cold      19    6       1    0   0
   Cool       5  168      43   13   1
   Neutral    0   65     190   50   4
   Warm       0    5      17   31   5
   Hot        0    1       0    1   1
Accuracy: 66.45% 

Confusion Matrix: Testing

          Reference
Prediction Cold Cool Neutral Warm Hot
   Cold       5    5       0    0   0
   Cool       5   66      24    6   1
   Neutral    2   38      72   21   4
   Warm       0    4       6    8   0
   Hot        0    0       1    0   0
Accuracy: 57.09% 
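
Overall accuracy hides how unevenly the five classes are handled, so the Python sketch below computes per-class recall (correct predictions divided by the Reference column total) from the two matrices above. The helper names are illustrative. Note that the diagonal sums of the printed matrices give about 65.3% (409 of 626) and 56.3% (151 of 268), slightly below the reported 66.45% and 57.09%; the reported figures may come from a different run or resampling scheme, which the source does not state.

```python
LABELS = ["Cold", "Cool", "Neutral", "Warm", "Hot"]

# Rows are predictions, columns are reference classes, as printed above
knn_train = [
    [19, 6, 1, 0, 0],
    [5, 168, 43, 13, 1],
    [0, 65, 190, 50, 4],
    [0, 5, 17, 31, 5],
    [0, 1, 0, 1, 1],
]
knn_test = [
    [5, 5, 0, 0, 0],
    [5, 66, 24, 6, 1],
    [2, 38, 72, 21, 4],
    [0, 4, 6, 8, 0],
    [0, 0, 1, 0, 0],
]


def recall_by_class(matrix):
    """Correct predictions for each class divided by its reference total."""
    result = {}
    for j, label in enumerate(LABELS):
        reference_total = sum(row[j] for row in matrix)
        result[label] = matrix[j][j] / reference_total
    return result


def diagonal_share(matrix):
    """Overall accuracy implied by the printed counts."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


for name, matrix in (("training", knn_train), ("testing", knn_test)):
    recalls = ", ".join(f"{k} {v:.0%}" for k, v in recall_by_class(matrix).items())
    print(f"{name}: accuracy {diagonal_share(matrix):.2%}; recall: {recalls}")
```

The rare Hot class is almost never recovered (1 of 11 in training, 0 of 5 in testing), while Cool and Neutral, which make up most of the data, drive the overall accuracy.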

Confusion Matrix: Training

Confusion Matrix: Testing

LOESS

Data

MSE Chart

Span 2.5 (Best Fit)

Span 0.1 (Over Fit)

Cubic Spline

Cubic Spline Graph

Final Model Summary

Estimate Std. Error t value Pr(>|t|)
(Intercept) 300049.74 8616.669 34.822009 0.00e+00
ns(Date, knots = knots)1 63237.63 11196.884 5.647789 1.00e-07
ns(Date, knots = knots)2 59749.21 14204.099 4.206476 4.62e-05
ns(Date, knots = knots)3 83134.85 12722.589 6.534429 0.00e+00
ns(Date, knots = knots)4 98600.44 13542.569 7.280778 0.00e+00
ns(Date, knots = knots)5 118837.30 13099.674 9.071776 0.00e+00
ns(Date, knots = knots)6 216045.48 13271.702 16.278656 0.00e+00
ns(Date, knots = knots)7 194525.24 11013.390 17.662612 0.00e+00
ns(Date, knots = knots)8 289832.95 21954.086 13.201777 0.00e+00
ns(Date, knots = knots)9 226895.17 9780.668 23.198331 0.00e+00
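
The term labels suggest the model was fit with R's `ns` natural-spline basis; nine basis columns plus an intercept would correspond to eight interior knots and two boundary knots. As a rough illustration of what such a basis looks like, the Python sketch below builds a natural cubic spline basis in the textbook truncated-power form. This is not the B-spline-based basis that `ns` computes, so its columns would not reproduce the coefficients above, and the evenly spaced knots are a placeholder because the actual knot positions are not shown.

```python
def natural_cubic_basis(x, knots):
    """Evaluate the K natural cubic spline basis functions at x.

    knots: sorted list of K >= 3 knots, boundary knots included.
    Returns [1, x, N_3(x), ..., N_K(x)] in truncated-power form.
    """
    last = knots[-1]

    def pos_cube(u):
        return max(u, 0.0) ** 3

    def d(k):
        # Scaled difference of truncated cubics anchored at knot k
        return (pos_cube(x - knots[k]) - pos_cube(x - last)) / (last - knots[k])

    basis = [1.0, float(x)]
    d_penultimate = d(len(knots) - 2)
    for k in range(len(knots) - 2):
        # Subtracting d_penultimate cancels the quadratic and cubic terms
        # beyond the last knot, so every function is linear in the tails
        basis.append(d(k) - d_penultimate)
    return basis


# Placeholder knots: 10 evenly spaced points on a rescaled date axis
example_knots = [float(i) for i in range(10)]
row = natural_cubic_basis(4.5, example_knots)
print(len(row), [round(v, 3) for v in row])
```

With 10 knots the basis has 10 functions, matching the 10 estimated coefficients in the summary (intercept plus nine spline terms).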

Cross-Validation Plot

Residuals

Naive Bayes

U.S. Unemployment Rate

10 Year Treasury Rate

30-Year Mortgage Rate

Consumer Sentiment

Confusion Matrix - Training

Confusion Matrix - Testing

Accuracy Plot